Bandit Forest

Authors

Raphael Féraud

Robin Allesiardo

Tanguy Urvoy

Fabrice Clérot

Abstract

To address the contextual bandit problem, we propose online decision tree algorithms. The analysis of proposed algorithms is based on the sample complexity needed to find the optimal decision stump. Then, the decision stumps are assembled in a decision tree, Bandit Tree, and in a random collection of decision trees, Bandit Forest. We show that the proposed algorithms are optimal up to a logarithmic factor. The dependence of the sample complexity upon the number of contextual variables is logarithmic. The computational cost of the proposed algorithms with respect to the time horizon is linear. These analytical results allow the proposed algorithms to be efficient in real applications, where the number of events to process is huge, and where we expect that some contextual variables, chosen from a large set, have potentially non-linear dependencies with the rewards. When Bandit Trees are assembled into a Bandit Forest, the analysis is done against a strong reference, the Random Forest built knowing the joint distribution of contexts and rewards. We show that the expected dependent regret bound against this strong reference is logarithmic with respect to the time horizon. In the experiments done to illustrate the theoretical analysis, Bandit Tree and Bandit Forest obtain promising results in comparison with state-of-the-art algorithms.
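The notion of an optimal decision stump can be illustrated with a small sketch. It is not the algorithm analyzed in the paper: it assumes binary context variables, actions played uniformly at random, and plain empirical means with no elimination step or confidence bounds. The names `simulate` and `best_stump`, and the synthetic reward rule, are hypothetical and chosen only for this example.

```python
import random
from collections import defaultdict


def simulate(n_rounds, n_vars=5, n_actions=2, informative=2, seed=0):
    """Generate (context, action, reward) triples under uniform exploration.

    Assumption for illustration: the reward is 1 only when the action
    equals the value of one informative binary context variable.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_rounds):
        x = tuple(rng.randint(0, 1) for _ in range(n_vars))
        a = rng.randrange(n_actions)
        r = 1.0 if a == x[informative] else 0.0
        samples.append((x, a, r))
    return samples


def best_stump(samples, n_vars, n_actions):
    """Return the variable whose stump has the highest estimated value.

    The value of the stump on variable i is estimated as
    sum over v of P(x_i = v) * max over actions of mean reward given x_i = v.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, a, r in samples:
        for i in range(n_vars):
            sums[(i, x[i], a)] += r
            counts[(i, x[i], a)] += 1

    n = len(samples)

    def value(i):
        total = 0.0
        for v in (0, 1):
            n_v = sum(counts[(i, v, a)] for a in range(n_actions))
            if n_v == 0:
                continue  # this value of x_i was never observed
            best_mean = max(
                sums[(i, v, a)] / counts[(i, v, a)]
                for a in range(n_actions)
                if counts[(i, v, a)] > 0
            )
            total += (n_v / n) * best_mean
        return total

    values = [value(i) for i in range(n_vars)]
    best = max(range(n_vars), key=values.__getitem__)
    return best, values


samples = simulate(2000)
best, values = best_stump(samples, n_vars=5, n_actions=2)
```

In this toy setting the selected variable is the informative one, because splitting on any other variable leaves both actions with a mean reward near one half. An online version would replace the fixed batch with per-round updates and a stopping rule, which is where the sample complexity discussed in the abstract comes in.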

Similar resources

Random Forest for the Contextual Bandit Problem

To address the contextual bandit problem, we propose an online random forest algorithm. The analysis of the proposed algorithm is based on the sample complexity needed to find the optimal decision stump. Then, the decision stumps are recursively stacked in a random collection of decision trees, BANDIT FOREST. We show that the proposed algorithm is optimal up to logarithmic factors. The dependen...

Adaptive play in Texas Hold'em Poker

We present a Texas Hold’em poker player for limit heads-up games. Our bot is designed to adapt automatically to the strategy of the opponent and is not based on Nash equilibrium computation. The main idea is to design a bot that builds beliefs about its opponent’s hand. A forest of game trees is generated according to those beliefs and the solutions of the trees are combined to make the best decisi...

Boosting with Online Binary Learners for the Multiclass Bandit Problem

We consider the problem of online multiclass prediction in the bandit setting. Compared with the full-information setting, in which the learner can receive the true label as feedback after making each prediction, the bandit setting assumes that the learner can only know the correctness of the predicted label. Because the bandit setting is more restricted, it is difficult to design good bandit l...

Contributions to the Asymptotic Minimax Theorem for the Two-Armed Bandit Problem

The asymptotic minimax theorem for the Bernoulli two-armed bandit problem states that the minimax risk has the order N^{1/2} as N → ∞, where N is the control horizon, and provides lower and upper estimates. It can be easily extended to the normal two-armed bandit. For the normal two-armed bandit, we generalize the asymptotic minimax theorem as follows: the minimax risk is approximately equal to 0.637 N^{1/2} as N → ∞. Ke...

The bandit, a New DNA Transposon from a Hookworm—Possible Horizontal Genetic Transfer between Host and Parasite

BACKGROUND An enhanced understanding of the hookworm genome and its resident mobile genetic elements should facilitate understanding of the genome evolution, genome organization, possibly host-parasite co-evolution and horizontal gene transfer, and from a practical perspective, development of transposon-based transgenesis for hookworms and other parasitic nematodes. METHODOLOGY/PRINCIPAL FIND...

Publication date: 2015
